Class Noise Mitigation Through Instance Weighting
Authors
Abstract
We describe a novel framework for class noise mitigation that assigns a vector of class membership probabilities to each training instance, and uses the confidence in the current label as a weight during training. The probability vector should be calculated such that clean instances have high confidence in their current labels, while mislabeled instances have low confidence in their current labels and high confidence in their correct labels. Past research has focused on techniques that either discard or correct instances. This paper proposes that discarding and correcting are special cases of instance weighting, and thus part of this framework. We propose a method that uses clustering to calculate a probability distribution over the class labels for each instance. We demonstrate that our method improves classifier accuracy over the original training set. We also demonstrate that instance weighting can outperform discarding.
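A minimal sketch of how such a clustering-based weighting scheme might look. This is an illustrative assumption, not the paper's exact algorithm: the function name, the use of KMeans, and the choice of LogisticRegression are all placeholders. The idea shown is only the core one from the abstract: estimate each instance's class-membership probabilities from the label composition of its cluster, then pass the confidence in the current label as a per-instance weight during training.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def label_confidence_weights(X, y, n_clusters=10, random_state=0):
    """Hypothetical sketch: per-instance confidence in the current label,
    estimated from the label distribution of the instance's cluster."""
    clusters = KMeans(n_clusters=n_clusters, n_init=10,
                      random_state=random_state).fit_predict(X)
    classes = np.unique(y)
    weights = np.empty(len(y))
    for c in range(n_clusters):
        members = clusters == c
        if not members.any():
            continue
        # label distribution inside this cluster ~ class membership probabilities
        counts = np.array([(y[members] == k).sum() for k in classes])
        probs = counts / counts.sum()
        # each member's weight is the probability of its own (current) label
        for i in np.where(members)[0]:
            weights[i] = probs[classes == y[i]][0]
    return weights

# Usage: weight (rather than discard) suspect instances during training.
X = np.random.RandomState(0).randn(200, 2)
y = (X[:, 0] > 0).astype(int)
w = label_confidence_weights(X, y, n_clusters=4)
clf = LogisticRegression().fit(X, y, sample_weight=w)
```

Discarding corresponds to a weight of 0 and full trust to a weight of 1, which is how the abstract frames both as special cases of instance weighting.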
Similar Resources
Combining Instance Weighting and Fine Tuning for Training Naïve Bayesian Classifiers with Scant data
This work addresses the problem of having to train a Naïve Bayesian classifier using limited data. It first presents an improved instance-weighting algorithm that is accurate and robust to noise and then it shows how to combine it with a fine tuning algorithm to achieve even better classification accuracy. Our empirical work using 49 benchmark data sets shows that the improved instance-weightin...
Modified RWGH and Positive Noise Mitigation Schemes for TOA Geolocation in Indoor Multi-hop Wireless Networks
Time of arrival (TOA) based geolocation schemes for indoor multi-hop environments are investigated and compared to conventional geolocation schemes such as least squares (LS) or residual weighting (RWGH). Multi-hop ranging involves positive multi-hop noise as well as non-line-of-sight (NLOS) and Gaussian measurement noise, so it is more prone to ranging error than one-hop range....
A Co-evolutionary Framework for Nearest Neighbor Enhancement: Combining Instance and Feature Weighting with Instance Selection
The nearest neighbor rule is one of the most representative methods in data mining. In recent years, a great number of proposals have arisen for improving its performance. Among them, instance selection stands out due to its ability to improve the accuracy of the classifier and its efficiency simultaneously, by editing out noise and considerably reducing the size of the training set. It...
EP-based robust weighting scheme for fuzzy SVMs
Support vector machine (SVM) classifiers represent one of the most powerful and promising tools for solving classification problems. In the past decade SVMs have been shown to have excellent performance in the field of data mining. The standard SVM classifier treats all instances equally. However, in many applications we have different levels of confidence in different instances that belong to ...
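The notion of giving instances different confidence levels during SVM training can be sketched with scikit-learn's `SVC`, whose `fit` accepts a per-instance `sample_weight`. The confidence scores below are a hypothetical stand-in; the EP-based scheme from this paper is not reproduced here.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(1)
X = rng.randn(100, 2)
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Hypothetical confidence scores: instances far from the decision
# boundary get weight near 1, borderline ones get a small weight.
margin = np.abs(X[:, 0] + X[:, 1])
confidence = 0.1 + 0.9 * margin / margin.max()

# A standard SVM would treat all instances equally; here each
# instance contributes in proportion to its confidence.
clf = SVC(kernel="linear").fit(X, y, sample_weight=confidence)
```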
Becoming More Robust to Label Noise with Classifier Diversity
It is widely known in the machine learning community that class noise can be (and often is) detrimental to inducing a model of the data. Many current approaches use a single, often biased, measurement to determine if an instance is noisy. A biased measure may work well on certain data sets, but it can also be less effective on a broader set of data sets [1]. In this paper, we present noise iden...
Journal:
Volume, Issue:
Pages: -
Publication date: 2007